OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Optical character recognition : An illustrated guide to the frontier

Internal identifier: 001E21 (Main/Exploration); previous: 001E20; next: 001E22

Authors: George Nagy (computer scientist) [United States]; T. A. Nartker [United States]; S. V. Rice [United States]

Source: SPIE proceedings series [ISSN 1017-2653]; 2000

RBID : Pascal:01-0029148

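French descriptors: Reconnaissance optique caractère; Evaluation système; Erreur; Typologie; Amélioration; Difficulté tâche; Cause; ISRI (Information Science Research Institute)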

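English descriptors: Cause; Error; Improvement; Optical character recognition; System evaluation; Task difficulty; Typology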

Abstract

We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Optical character recognition : An illustrated guide to the frontier</title>
<author>
<name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<inist:fA14 i1="01">
<s1>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY 12180</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
<author>
<name sortKey="Nartker, T A" sort="Nartker, T A" uniqKey="Nartker T" first="T. A." last="Nartker">T. A. Nartker</name>
<affiliation wicri:level="2">
<inist:fA14 i1="02">
<s1>Dept. of Computer Science, University of Nevada</s1>
<s2>Las Vegas, NV 89154</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Nevada</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rice, S V" sort="Rice, S V" uniqKey="Rice S" first="S. V." last="Rice">S. V. Rice</name>
<affiliation wicri:level="2">
<inist:fA14 i1="03">
<s1>Comparisonics Corporation</s1>
<s2>Grass Valley, CA 95945</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">01-0029148</idno>
<date when="2000">2000</date>
<idno type="stanalyst">PASCAL 01-0029148 INIST</idno>
<idno type="RBID">Pascal:01-0029148</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000742</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000051</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000714</idno>
<idno type="wicri:doubleKey">1017-2653:2000:Nagy G:optical:character:recognition</idno>
<idno type="wicri:Area/Main/Merge">001F26</idno>
<idno type="wicri:Area/Main/Curation">001E21</idno>
<idno type="wicri:Area/Main/Exploration">001E21</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Optical character recognition : An illustrated guide to the frontier</title>
<author>
<name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<inist:fA14 i1="01">
<s1>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY 12180</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
<author>
<name sortKey="Nartker, T A" sort="Nartker, T A" uniqKey="Nartker T" first="T. A." last="Nartker">T. A. Nartker</name>
<affiliation wicri:level="2">
<inist:fA14 i1="02">
<s1>Dept. of Computer Science, University of Nevada</s1>
<s2>Las Vegas, NV 89154</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Nevada</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rice, S V" sort="Rice, S V" uniqKey="Rice S" first="S. V." last="Rice">S. V. Rice</name>
<affiliation wicri:level="2">
<inist:fA14 i1="03">
<s1>Comparisonics Corporation</s1>
<s2>Grass Valley, CA 95945</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="2000">2000</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Cause</term>
<term>Error</term>
<term>Improvement</term>
<term>Optical character recognition</term>
<term>System evaluation</term>
<term>Task difficulty</term>
<term>Typology</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance optique caractère</term>
<term>Evaluation système</term>
<term>Erreur</term>
<term>Typologie</term>
<term>Amélioration</term>
<term>Difficulté tâche</term>
<term>Cause</term>
<term>ISRI (Information Science Research Institute)</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Californie</li>
<li>Nevada</li>
<li>État de New York</li>
</region>
<settlement>
<li>Troy (New York)</li>
</settlement>
<orgName>
<li>Institut polytechnique Rensselaer</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">George Nagy (informaticien)</name>
</region>
<name sortKey="Nartker, T A" sort="Nartker, T A" uniqKey="Nartker T" first="T. A." last="Nartker">T. A. Nartker</name>
<name sortKey="Rice, S V" sort="Rice, S V" uniqKey="Rice S" first="S. V." last="Rice">S. V. Rice</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001E21 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001E21 | SxmlIndent | more
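
Without Dilib, the XML record shown on this page could also be read with a general-purpose parser. The following is only a minimal sketch in Python, not part of Dilib or Wicri. It assumes the record has been saved as a bare `<record>` fragment with no XML declaration. The function names `parse_record` and `summarize` are hypothetical, and the namespace URIs are placeholders, needed only because the record uses the `wicri:` and `inist:` prefixes without declaring them.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the record uses the wicri: and
# inist: prefixes without declaring them, so a strict parser needs bindings.
PLACEHOLDER_NS = (
    'xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist"'
)


def parse_record(record_xml: str) -> ET.Element:
    """Parse a <record> fragment that carries no XML declaration."""
    # Wrap the fragment so the undeclared prefixes are bound somewhere.
    wrapped = f"<wrapper {PLACEHOLDER_NS}>{record_xml}</wrapper>"
    record = ET.fromstring(wrapped).find("record")
    if record is None:
        raise ValueError("no <record> element found")
    return record


def summarize(record: ET.Element) -> dict:
    """Collect the article title, author names, and English keywords."""
    analytic = record.find(".//analytic")
    if analytic is None:
        raise ValueError("no <analytic> element found")
    keywords = record.find(".//keywords[@scheme='KwdEn']")
    terms = keywords.findall("term") if keywords is not None else []
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "keywords": [term.text for term in terms],
    }
```

With the record saved to a file (for example a hypothetical `record.xml`), `summarize(parse_record(open("record.xml", encoding="utf-8").read()))` would return a dictionary with the keys `title`, `authors`, and `keywords`.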

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:01-0029148
   |texte=   Optical character recognition : An illustrated guide to the frontier
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024